PRODIS -- a speech database and a phoneme-based language model for the study of predictability effects in Polish

Malisz, Zofia, Foremski, Jan, Kul, Małgorzata

arXiv.org Artificial Intelligence

We present a speech database and a phoneme-level language model of Polish. The database and model are designed for analyzing prosodic and discourse factors and their impact on acoustic parameters in interaction with predictability effects. The database is also the first large, publicly available Polish speech corpus of excellent acoustic quality suitable both for phonetic analysis and for training multi-speaker speech technology systems. The speech in the database is processed by a pipeline that is 90% automated. The pipeline incorporates state-of-the-art, freely available tools, enabling expansion of the database or adaptation to additional languages.


Construction and Evaluation of Mandarin Multimodal Emotional Speech Database

Ting, Zhu, Liangqi, Li, Shufei, Duan, Xueying, Zhang, Zhongzhe, Xiao, Hairong, Jia, Huizhi, Liang

arXiv.org Artificial Intelligence

A multimodal Mandarin emotional speech database including articulatory kinematics, acoustics, glottal signals, and facial micro-expressions is designed and established, and is described in detail in terms of corpus design, subject selection, recording setup, and data processing. Signals are labeled with discrete emotion labels (neutral, happy, pleasant, indifferent, angry, sad, grief) and dimensional emotion labels (pleasure, arousal, dominance). In this paper, the validity of the dimensional annotation is verified by statistical analysis of the annotation data. The annotators' SCL-90 scale data are verified and combined with the PAD annotation data in order to explore the relationship between outliers in the annotation and the psychological state of the annotators. To verify the speech quality and emotion discriminability of the database, this paper uses three baseline models (SVM, CNN, and DNN) to compute the recognition rate for the seven emotions. The results show that the average recognition rate over the seven emotions is about 82% when using acoustic data alone, about 72% when using glottal data alone, and 55.7% when using kinematic data alone. The database is therefore of high quality and can serve as an important resource for speech analysis research, especially for multimodal emotional speech analysis.
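The per-emotion and average recognition rates reported above can be derived from a confusion matrix over the seven emotion classes. The sketch below is not the paper's code; it is a minimal stdlib-only illustration of that metric, with an assumed, purely illustrative confusion matrix.

```python
# Illustrative sketch: per-class recognition rates (class-wise accuracy)
# and their macro-average, the metric reported for the SVM/CNN/DNN baselines.
EMOTIONS = ["neutral", "happy", "pleasant", "indifferent", "angry", "sad", "grief"]

def per_class_recognition_rates(confusion):
    """confusion[i][j] = number of class-i samples predicted as class j."""
    rates = {}
    for i, emotion in enumerate(EMOTIONS):
        total = sum(confusion[i])
        rates[emotion] = confusion[i][i] / total if total else 0.0
    return rates

def average_recognition_rate(confusion):
    """Macro-average of the per-class recognition rates."""
    rates = per_class_recognition_rates(confusion)
    return sum(rates.values()) / len(rates)

# Hypothetical confusion matrix for 10 test samples per emotion.
confusion = [
    [9, 1, 0, 0, 0, 0, 0],
    [1, 8, 1, 0, 0, 0, 0],
    [0, 1, 8, 1, 0, 0, 0],
    [1, 0, 1, 8, 0, 0, 0],
    [0, 0, 0, 0, 9, 1, 0],
    [0, 0, 0, 0, 1, 8, 1],
    [0, 0, 0, 0, 0, 2, 8],
]
print(round(average_recognition_rate(confusion), 3))  # 0.829
```

With real data, each row of the matrix would be accumulated from a held-out test set for one modality (acoustic, glottal, or kinematic) at a time.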


Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi

Kumar, Ritesh, Singh, Siddharth, Ratan, Shyam, Raj, Mohit, Sinha, Sonal, Lahiri, Bornini, Seshadri, Vivek, Bali, Kalika, Ojha, Atul Kr.

arXiv.org Artificial Intelligence

In this paper we discuss in-progress work on the development of a speech corpus for four low-resource Indo-Aryan languages -- Awadhi, Bhojpuri, Braj and Magahi -- using the field methods of linguistic data collection. The corpus currently totals approximately 18 hours (roughly 4-5 hours per language) and is transcribed and annotated with grammatical information such as part-of-speech tags, morphological features and Universal Dependencies relations. We discuss our methodology for data collection in these languages, most of which was carried out during the COVID-19 pandemic, with one aim being to generate additional income for low-income speakers of these languages. We also report the results of baseline experiments with automatic speech recognition systems for these languages.


BEA-Base: A Benchmark for ASR of Spontaneous Hungarian

Mihajlik, P., Balog, A., Gráczi, T. E., Kohári, A., Tarján, B., Mády, K.

arXiv.org Artificial Intelligence

Hungarian is spoken by 15 million people, yet easily accessible Automatic Speech Recognition (ASR) benchmark datasets, especially for spontaneous speech, have been practically unavailable. In this paper, we introduce BEA-Base, a subset of the BEA spoken Hungarian database comprising mostly spontaneous speech from 140 speakers. It is built specifically to assess ASR, primarily for conversational AI applications. After defining the speech recognition subsets and tasks, several baselines, including classic hybrid HMM-DNN and end-to-end approaches augmented by cross-language transfer learning, are developed using open-source toolkits. The best results are obtained with multilingual self-supervised pretraining, achieving a 45% relative recognition error rate reduction compared to the classical approach, without an external language model or additional supervised data. The results show the feasibility of using BEA-Base for training and evaluating Hungarian speech recognition systems.
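The "45% recognition error rate reduction" above is a relative improvement, not an absolute drop in error rate. A short sketch of that arithmetic (the WER values here are assumed for illustration, not taken from the paper):

```python
def relative_error_reduction(baseline_err: float, new_err: float) -> float:
    """Relative reduction in error rate: (baseline - new) / baseline."""
    return (baseline_err - new_err) / baseline_err

# Illustrative: a 45% relative reduction means, e.g., a baseline error
# rate of 40.0% dropping to 22.0%.
print(round(relative_error_reduction(0.40, 0.22), 2))  # 0.45
```

So two systems can both show a "45% reduction" while differing widely in absolute error rate, which is why benchmark comparisons usually report both figures.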


What Is Natural Language Processing and How Does It Work? - Text2Speech Blog

#artificialintelligence

In 1950, Alan Turing published his famous paper titled "Computing Machinery and Intelligence". The paper proposed a test to determine whether a machine was artificially intelligent. Basically, Turing said that if a machine could hold a conversation with a human and trick the human into thinking the machine was itself a person, then it was artificially intelligent. This became known as the Turing Test, and passing it has been one of the most sought-after goals in computer science. Passing the Turing Test would signal the birth of artificial intelligence.